The High Altitude Water Cherenkov (HAWC) gamma-ray observatory is located at an altitude
of 4100 meters in Sierra Negra, Puebla, Mexico. HAWC is an air shower array of 300 water
Cherenkov detectors (WCDs), each with 4 photomultiplier tubes (PMTs). Because the observatory
is sensitive to air showers produced by both cosmic rays and gamma rays, one of the main
tasks in the analysis of gamma-ray sources is gamma/hadron separation to suppress the
cosmic-ray background. Currently, HAWC uses a method called compactness for the separation.
This method divides the data into 10 bins that depend on the number of PMTs in each event, and
each bin has its own cut value. In this work we present a new method that depends continuously
on the number of PMTs in the event instead of binning, and therefore uses a single cut
for gamma/hadron separation. The method uses a Feedforward Multilayer Perceptron (MLP) network
fed with five characteristics of the air shower to create a single output value. We used simulated
cosmic-ray and gamma-ray events to find the optimal cut and then applied the technique to
data from the Crab Nebula. The new method is tuned on Monte Carlo (MC) simulation and predicts better
gamma/hadron separation than the existing one. Preliminary tests on the Crab data are consistent with such an
improvement, but in future work it needs to be compared with the full implementation of compactness,
with selection criteria tuned for each of the data bins.
1. Introduction

The High Altitude Water Cherenkov (HAWC) gamma-ray observatory is composed of 300 water
Cherenkov detectors (WCDs). At the bottom of each WCD there are 4 photomultiplier tubes (PMTs)
that detect Cherenkov light. This light is produced by the secondary particles of the air shower
generated by the interaction between the atmosphere and a primary particle (for example gamma rays
or protons, among other particles). The rate of cosmic rays (CR) is higher than that of gamma rays (GR),
so it is critical to find a technique that removes the CR without losing the GR signal.

Currently, HAWC uses a method called compactness to distinguish between these primary particles.
To do this, the data are divided into 10 bins (see Table 1) depending on nHit, the number
of PMTs that have a signal in the event. The compactness depends on the distribution of the charge
deposited by the secondary particles of the shower on the PMTs of the array. In this work, a new method
is presented that uses a Neural Network (NN) for gamma/hadron separation without dividing the
data into bins. Five characteristics are computed and fed to a NN, which computes a single output value; a cut
on this value ($\theta_{NN}$) distinguishes between CR and GR. Another method in development can be found in [1].

2. Training stage

The NN used in this work is a Feedforward Multilayer Perceptron [2]. For a correct evaluation,
the NN must pass through two stages, training and testing. In the training stage the aim is to minimize the
classification error. First, the values of the characteristic inputs are calculated and a training MC data set
is selected. The architecture is defined as 5-5-5-1 (Figure 1a): the first layer has 5 neurons because
the NN needs 5 characteristics as input¹, and the last layer has one neuron because the network needs to
recognize only two types of particle. Different NN architectures were tested, but the learning
curves were similar. The recommended total number of layers for a NN is $N - 1$,
where $N$ is the number of input variables [3]; in our case $N = 5$, so the simple structure (5-5-5-1)
was chosen to save computing time. The learning method used was stochastic minimization, and it
took 500 epochs for the output error to reach asymptotic behavior.
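As a rough illustration of the 5-5-5-1 layout, the sketch below runs a forward pass through a network with five inputs, two hidden layers of five neurons, and one output. The sigmoid hidden activation, the linear output neuron (chosen here only because Table 2 lists thresholds above 1), the random untrained weights, and the names `init_layer` and `forward` are all assumptions made for illustration; the text does not specify them.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Return (weights, biases) filled with small random placeholders (untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(layers, inputs):
    """Propagate one event (five normalized features) through all layers."""
    activations = list(inputs)
    last = len(layers) - 1
    for index, (weights, biases) in enumerate(layers):
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        # Assumption: sigmoid in the hidden layers, linear output neuron.
        activations = sums if index == last else [sigmoid(s) for s in sums]
    return activations[0]

rng = random.Random(0)
sizes = [5, 5, 5, 1]  # the 5-5-5-1 architecture
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes[:-1], sizes[1:])]

event = [0.2, 0.1, 0.05, 0.3, 0.4]  # hypothetical normalized P1..P5
output = forward(layers, event)
```

With trained weights, `output` would be the quantity that is compared with the threshold to label an event as gamma-like or hadron-like.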

Figure 1b shows the histogram of the NN output. The majority of the events produced
by GR are close to 1, and those produced by CR are close to 0. Finding the optimal cut on this variable allows us to
separate the different types of primary particles. This threshold value is defined as $\theta_{NN}$.

The Q factor is defined as $Q = \varepsilon_{gamma} / \sqrt{\varepsilon_{hadron}}$, where $\varepsilon_{gamma}$ is the fraction of gamma events
that are classified correctly, also called the gamma efficiency, and $\varepsilon_{hadron}$ is the fraction of hadron events that are
classified as gamma events, also called the hadron efficiency. The Q value estimates the factor by
which the significance will be increased by the classification. Figure 2a shows the Q factor as a function of
$\theta_{NN}$, where it can be seen that the highest value of Q corresponds to a value around
$\theta_{NN} = 0.98$. The receiver operating characteristic (ROC) curve is useful for comparing classifiers
and visualizing their performance [4]. From the ROC curve we can see that by using $\theta_{NN} = 0.96$
we increase the gamma efficiency, even though the Q factor is slightly lower with this cut (see Table 2).
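The threshold scan described above can be sketched as follows. The function names, the toy output lists, and the convention that an event passes when its output is greater than or equal to the threshold are assumptions for illustration only.

```python
import math

def efficiencies(gamma_outputs, hadron_outputs, threshold):
    """Fractions of gamma and hadron events whose NN output passes the cut."""
    eps_gamma = sum(o >= threshold for o in gamma_outputs) / len(gamma_outputs)
    eps_hadron = sum(o >= threshold for o in hadron_outputs) / len(hadron_outputs)
    return eps_gamma, eps_hadron

def q_factor(eps_gamma, eps_hadron):
    """Q = eps_gamma / sqrt(eps_hadron); undefined when no hadron survives."""
    if eps_hadron <= 0.0:
        return float("nan")
    return eps_gamma / math.sqrt(eps_hadron)

def best_threshold(gamma_outputs, hadron_outputs, thresholds):
    """Return (threshold, Q) with the largest defined Q, or None."""
    best = None
    for t in thresholds:
        q = q_factor(*efficiencies(gamma_outputs, hadron_outputs, t))
        if math.isnan(q):
            continue
        if best is None or q > best[1]:
            best = (t, q)
    return best

# Toy NN outputs (hypothetical, not HAWC simulation).
gamma_outputs = [0.99, 0.98, 0.97, 0.90, 0.50]
hadron_outputs = [0.05, 0.10, 0.20, 0.95, 0.30, 0.02, 0.15, 0.40, 0.60, 0.01]
best = best_threshold(gamma_outputs, hadron_outputs, [0.5, 0.9, 0.96])
```

Applied to the rounded total efficiencies quoted later in Table 3, `q_factor(0.606, 0.017)` gives about 4.65, close to the tabulated 4.663; the small difference is expected if the table was computed from unrounded efficiencies.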


¹ Each input is normalized with respect to the maximum value of each feature.
bin   nHit min   nHit max   $\theta_c$
-1    30         54         -
0     55         87         4.6
1     88         138        6.3
2     139        216        9.8
3     217        323        12.7
4     324        457        17.6
5     458        606        19.5
6     607        754        18.5
7     755        889        17.1
8     890        1000       15.0
9     1001       1200       12.4

Table 1: nHit range and gamma/hadron cut in each bin for HAWC-300; $\theta_c$ is the compactness cut value.
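The binned scheme of Table 1 amounts to a lookup from nHit to a bin and its cut. The sketch below encodes the table and applies the cut; the helper names are hypothetical, and the direction of the cut (an event is kept as gamma-like when its compactness exceeds the bin's cut) is an assumption, since it is not stated here.

```python
# (bin, nHit min, nHit max, compactness cut) copied from Table 1.
# Bin -1 has no cut value in the table.
TABLE_1 = [
    (-1, 30, 54, None),
    (0, 55, 87, 4.6),
    (1, 88, 138, 6.3),
    (2, 139, 216, 9.8),
    (3, 217, 323, 12.7),
    (4, 324, 457, 17.6),
    (5, 458, 606, 19.5),
    (6, 607, 754, 18.5),
    (7, 755, 889, 17.1),
    (8, 890, 1000, 15.0),
    (9, 1001, 1200, 12.4),
]

def find_bin(n_hit):
    """Return (bin, cut) for an event, or None if nHit is outside all bins."""
    for bin_id, n_min, n_max, cut in TABLE_1:
        if n_min <= n_hit <= n_max:
            return bin_id, cut
    return None

def passes_compactness(n_hit, compactness):
    """Assumed convention: keep the event when compactness exceeds the bin cut."""
    found = find_bin(n_hit)
    if found is None or found[1] is None:
        return False  # outside the table, or a bin without a cut
    return compactness > found[1]
```

For example, `find_bin(100)` returns `(1, 6.3)`, so an event with nHit = 100 is compared against the cut of bin 1.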



Figure 1: (a) Architecture of the NN, with 5 input neurons, two hidden layers of 5 neurons each,
and one output neuron. The width of each connection line between neurons is proportional to the corresponding weight
of the NN. (b) Outputs of the NN for gammas and hadrons in the learning stage. The majority
of gamma events have an output close to 1, and protons are close to 0.

2.1 Choice of characteristic inputs 

The main idea is to use the morphological differences in the charge distribution on the PMTs for
the two types of primary particle. In events produced by gammas, the PMTs close to the core of
the shower have the largest signals and the charge distribution is characterized by a compact and
Figure 2: (a) Q factor as a function of the NN output threshold. The largest Q factor is 4.76, reached when the output
threshold is around 0.98. (b) ROC curve for the NN. A $\theta_{NN}$ corresponding to an $\varepsilon_{gamma}$
between 0.6 and 0.7 could be used, at the loss of some Q value.


$\theta_{NN}$   $\varepsilon_{gamma}$   $\varepsilon_{hadron}$   Q factor
0.94            0.713                   0.028                    4.309
0.96            0.666                   0.024                    4.424
0.98            0.604                   0.019                    4.761
1.00            0.495                   0.011                    4.160
1.02            0.306                   0.005                    2.787

Table 2: Values of the gamma and hadron efficiencies close to the maximum value of the Q factor. Here, for
completeness, we include bin -1 from Table 1, even though that bin is not used in the compactness
analysis.


smooth profile. But in the case of hadrons, PMTs with high charge can be far away from the core 
and the charge distribution is not compact. 

• The first feature is the number of PMTs with at least one photoelectron (PE),
because it is directly related to the energy of the primary. We need our NN to separate the two classes
independently of the energy of the CR or GR. This feature replaces the nHit binning used with the
compactness cut (P1 = nHit).

• DisMax (P2) is the largest distance between any pair of tubes that pass the following
selection: first, all the PMTs in the event are sorted by their detected PEs, and this
value is summed PMT by PMT from highest to lowest while the sum remains less than $(SumPE - MaxPE)\,k(nHit)$,
where MaxPE is the maximum number of PEs in any PMT in the event and $k$ is a factor that depends
linearly on nHit; the PMTs involved in that sum are the selected ones. This input measures the
spatial spread of the PMTs with the largest detected charge, because we assume that
for gammas all the PMTs with high PE are neighbors, so DisMax should be small.

• P3 is associated with the integral of the radial density in the region where the hadron shower
dominates the gamma shower [5], defined as:

where $R_{PE_i} > 30$ m

Here $PE_i$ is the charge in PMT $i$ and $R_{PE_i}$ is the distance in meters between PMT $i$ and the
position of the reconstructed shower center (core).

• P4 is defined as $CxPE_{30} / MaxPE$, where $CxPE_{30}$ is the maximum charge outside an exclusion
radius of 30 m in the event. For protons one often expects to see localized high charge
deposition far from the core, so P4 can approach 1 for protons. Gammas, on the other hand,
usually have a value near 0 because most of their charge is deposited near the core.

• P5 is related to the difference between the maximum charge outside and inside the exclusion
radius, each weighted by its distance to the core:

$P5 = \log_{10}\left( \left| CxPE_{30} \cdot R_{CxPE_{30}} - PE_{maxint} \cdot R_{PE_{maxint}} \right| \right)$

where $R_{CxPE_{30}} > 30$ m and $R_{PE_{maxint}} < 30$ m.
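A minimal sketch of how P1, P4 and P5 could be computed for one event is given below, assuming the event is available as two parallel lists (charge in PEs and distance to the reconstructed core in meters, one entry per PMT). The function name, the input layout, the handling of events with no hit on one side of the exclusion radius, and the reading of P5 as the absolute difference of the two charge-times-distance products are assumptions for illustration, not the HAWC software interface.

```python
import math

EXCLUSION_RADIUS_M = 30.0

def charge_features(charges, distances):
    """Return (P1, P4, P5) for one event.

    charges[i]   : charge in PEs seen by PMT i
    distances[i] : distance in meters from PMT i to the reconstructed core
    """
    # P1: PMTs with at least one photoelectron.
    hits = [(q, r) for q, r in zip(charges, distances) if q >= 1.0]
    if not hits:
        raise ValueError("event has no PMT with at least one PE")
    n_hit = len(hits)
    max_pe = max(q for q, _ in hits)

    # Largest charge outside and inside the exclusion radius, each with its distance.
    outside = [(q, r) for q, r in hits if r > EXCLUSION_RADIUS_M]
    inside = [(q, r) for q, r in hits if r < EXCLUSION_RADIUS_M]
    cxpe30, r_cxpe30 = max(outside, default=(0.0, 0.0))        # assumed fallback
    pe_maxint, r_pe_maxint = max(inside, default=(0.0, 0.0))   # assumed fallback

    p4 = cxpe30 / max_pe
    diff = abs(cxpe30 * r_cxpe30 - pe_maxint * r_pe_maxint)
    p5 = math.log10(diff) if diff > 0.0 else float("-inf")
    return n_hit, p4, p5

# Hypothetical gamma-like event: most charge close to the core.
charges = [120.0, 40.0, 8.0, 3.0, 0.5]
distances = [5.0, 12.0, 45.0, 80.0, 60.0]
p1, p4, p5 = charge_features(charges, distances)
```

In this toy event the largest charge outside 30 m is 8 PE at 45 m and the largest inside is 120 PE at 5 m, so P4 = 8/120 and P5 = log10(|360 - 600|) = log10(240).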


2.2 Training data set

The simulated events were generated using the CORSIKA program in the energy range [0.005, 100]
TeV with a flat spectrum and zenith angles in [0, 75]°. The performance and response of the array were
computed using the official HAWC software.

For the training stage, the network needs two data sets, one for gammas and the other for hadrons. We
defined a target value of 1 for gamma-ray events and 0 for hadron events. In this work we use only
protons as hadrons because protons constitute nearly 99% of the CR. The conditions for selecting
the training events for each set are:

• The event is well reconstructed.

• The difference between the reconstructed and simulated core positions does not exceed 5 m.

• The core falls inside the HAWC array.

• The event has nHit between 30 and 1200.
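The selection above can be written as a simple filter over simulated events. In this sketch each event is a dictionary whose keys are hypothetical names for the quantities in the list, and the core criterion is read as requiring the reconstructed core to lie within 5 m of the simulated one; both are assumptions.

```python
def passes_selection(event, max_core_error_m=5.0, n_hit_min=30, n_hit_max=1200):
    """Apply the four training-set criteria to one simulated event."""
    return (
        event["well_reconstructed"]
        and event["core_error_m"] <= max_core_error_m
        and event["core_in_array"]
        and n_hit_min <= event["n_hit"] <= n_hit_max
    )

def build_training_set(events):
    """Return (event, target) pairs: target 1.0 for gammas, 0.0 for hadrons."""
    return [
        (event, 1.0 if event["is_gamma"] else 0.0)
        for event in events
        if passes_selection(event)
    ]

def make_event(well_reconstructed, core_error_m, core_in_array, n_hit, is_gamma):
    """Helper that builds a toy event dictionary (hypothetical field names)."""
    return {
        "well_reconstructed": well_reconstructed,
        "core_error_m": core_error_m,
        "core_in_array": core_in_array,
        "n_hit": n_hit,
        "is_gamma": is_gamma,
    }

events = [
    make_event(True, 2.0, True, 150, True),    # kept, target 1
    make_event(True, 8.0, True, 150, True),    # rejected: core error above 5 m
    make_event(True, 1.0, True, 20, False),    # rejected: nHit below 30
    make_event(True, 3.5, True, 900, False),   # kept, target 0
    make_event(True, 1.0, False, 300, True),   # rejected: core outside the array
]
training_set = build_training_set(events)
```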
Figure 3: The Q factor calculated for each bin and for the total (bins 0 to 9) with $\theta_{NN} = 0.96$. In
some bins the Q factor of the NN is better than that of compactness, while in others it is not. Using the total bin,
we obtained gamma efficiencies of 60% and 53% for the NN and compactness, respectively.


3. Testing stage

3.1 Simulation

In this stage we use the same criteria described above for the training
data set to select the events of a test set, which consists of new simulated events independent of the training set. To compare
the two methods we use events with 55 ≤ nHit ≤ 1200, which correspond to bins 0 through 9;
i.e., we do not use bin -1. In this comparison we simply weight all events equally,
without the optimal per-bin weighting used in the Crab analysis [6]. However, we
do apply the compactness cuts of Table 1 in each nHit bin to compare the performance of the NN and
compactness. The bin called "total" is computed using all events from bin 0 to bin 9. The results
are shown in Figure 3, where we can see that for the total bin the NN gives a better Q factor than
the compactness method.

The total values of the Q factor, gamma efficiency and hadron efficiency for each separation method
(compactness and NN) are shown in Table 3. The NN improves on the compactness method:
the gamma efficiency increased by about 13% and the hadron efficiency decreased by about 30%, so the Q factor
increased by about 36%.

3.2 Data 

Another way to compare the performance of compactness and the NN is to use
HAWC data. We chose a set of well-reconstructed events within ±6° of the Crab Nebula. We have
two methods (NKG and Gauss) for reconstructing the core position, but only Gauss was used in
Parameter           NN      compactness   Increase (%)
Q factor            4.663   3.432         35.889
gamma efficiency    0.606   0.536         13.129
hadron efficiency   0.017   0.024         -30.693

Table 3: Difference between methods with simulation.


$\theta_c$   NKG      Gauss
10.0         3.4706   4.4649
12.0         4.3142   4.4703
14.0         5.2777   4.6895
16.0         3.9327   4.3406
18.0         4.3170   4.3613

Table 4: Significance using the compactness variable with a single cut value for all bins.


$\theta_{NN}$   NKG      Gauss
0.92            5.8842   4.9889
0.94            5.7042   5.4144
0.96            5.9217   5.5096
0.98            3.7534   4.6703
1.00            4.0977   3.1792

Table 5: Significance using the NN as a function of the NN threshold.


training the NN. A well-behaved event should have a similar core position with either method. For
compactness we use a very simple analysis [7] and apply a cut $\theta_c$ that varies
from 10 to 18 but is the same for all nHit bins. For technical reasons we were not able to apply the
bin-dependent cuts of Table 1 to the Crab data, so this constitutes a preliminary comparison of the NN
and compactness on the Crab data. The results are shown in Table 4. For the NN method,
the maps are obtained by varying $\theta_{NN}$ from 0.92 to 1.0 (see Table 5).

The highest significance values from Tables 4 and 5 are collected in Table 6, together with the increase
with respect to the compactness method. The results show that the NN is better than
compactness in this preliminary comparison, consistent with expectations from MC. With the NKG
method the increase is 12%, and with the Gauss method it is 17%, which is not surprising since the NN was trained
with events whose core reconstruction was done with the Gauss method.
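The comparison can be reproduced directly from the tabulated significances. The sketch below copies the values of Tables 4 and 5, takes the best value per core-reconstruction method, and computes the relative increase; the variable and function names are illustrative only.

```python
# Significances copied from Table 4 (compactness, keyed by cut value)
# and Table 5 (NN, keyed by output threshold).
compactness = {
    "NKG": {10.0: 3.4706, 12.0: 4.3142, 14.0: 5.2777, 16.0: 3.9327, 18.0: 4.3170},
    "Gauss": {10.0: 4.4649, 12.0: 4.4703, 14.0: 4.6895, 16.0: 4.3406, 18.0: 4.3613},
}
nn = {
    "NKG": {0.92: 5.8842, 0.94: 5.7042, 0.96: 5.9217, 0.98: 3.7534, 1.00: 4.0977},
    "Gauss": {0.92: 4.9889, 0.94: 5.4144, 0.96: 5.5096, 0.98: 4.6703, 1.00: 3.1792},
}

def percent_increase(new, reference):
    """Relative change of `new` with respect to `reference`, in percent."""
    return 100.0 * (new - reference) / reference

summary = {}
for method in ("NKG", "Gauss"):
    best_compactness = max(compactness[method].values())
    best_nn = max(nn[method].values())
    summary[method] = (best_compactness, best_nn,
                       percent_increase(best_nn, best_compactness))
```

For both core-reconstruction methods the best compactness cut is 14.0 and the best NN threshold is 0.96, and the resulting increases are about 12.2% (NKG) and 17.5% (Gauss).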

4. Conclusion 

In this work, we propose a new method for gamma/hadron separation that uses a Multilayer
Perceptron fed with 5 characteristics. The NN's output is continuous and has a target value of 1
Method         NKG      Gauss
compactness    5.2777   4.6895
NN             5.9217   5.5096
Increase (%)   12.202   17.488

Table 6: Difference between methods with data.


for gamma events and 0 for hadron events. In the analysis, we found an optimal cut value for the
NN output of $\theta_{NN} = 0.96$. With this value the NN performs better than compactness. The
Q factor increases by approximately 36%, because the gamma efficiency increases by about 13% and the
hadron efficiency decreases by about 30%.

In the case of Crab data we also obtained a better significance using NN instead of a simplified 
version of compactness where the compactness cut was constrained to be the same for all nHit bins. 
In future work we will compare with the full compactness implementation. 